fix(bedrock_guardrails): use Bedrock OUTPUT source for apply_guardrail when scanning model responses #26144
… scans
BedrockGuardrail.apply_guardrail hardcoded source="INPUT" regardless of the
input_type parameter. On the non-streaming post-call path (unified_guardrail
-> OpenAIChatCompletionsHandler.process_output_response -> apply_guardrail),
the model response text was sent to Bedrock as INPUT, so guardrail policies
configured for Output (e.g. PII/NAME blocking) returned action=NONE and the
response passed through unblocked. The streaming path was unaffected because
it calls make_bedrock_api_request(source="OUTPUT", ...) directly.
Map input_type to the correct Bedrock source ("request" -> INPUT,
"response" -> OUTPUT) and build a synthetic ModelResponse for the OUTPUT
path so _create_bedrock_output_content_request produces the correct payload.
Made-with: Cursor
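The mapping the commit message describes can be sketched in a few lines. This is a hedged illustration: the helper name `bedrock_source_for` is hypothetical, not the actual litellm code, but the "request" → INPUT / "response" → OUTPUT rule matches the fix.

```python
# Hypothetical helper illustrating the mapping from the commit message;
# Bedrock's ApplyGuardrail API accepts source="INPUT" or source="OUTPUT".
def bedrock_source_for(input_type: str) -> str:
    # pre-call scans ("request") check the user prompt as INPUT;
    # post-call scans ("response") must send model text as OUTPUT,
    # otherwise output-only policies evaluate nothing and return action=NONE
    return "OUTPUT" if input_type == "response" else "INPUT"
```

Before the fix, both branches effectively behaved like the `"INPUT"` case, which is why output-only policies never fired on the non-streaming path.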
Greptile Summary: This PR fixes …

Confidence Score: 5/5 — safe to merge. The fix is narrowly scoped to the apply_guardrail OUTPUT path, is well tested with mock-only regression tests, and addresses a real behavioural gap without touching other code paths. All findings are P2 or lower; the previous P1 concern about test coverage for the OUTPUT branch was addressed by the author (two new tests added). The fix is correct, the logic is clear, and no regressions are introduced. No files require special attention.
| Filename | Overview |
|---|---|
| litellm/proxy/guardrails/guardrail_hooks/bedrock_guardrails.py | Fixes apply_guardrail to use source=OUTPUT when input_type="response", building a synthetic ModelResponse for the OUTPUT path; logic is correct and well-commented. |
| tests/test_litellm/proxy/guardrails/guardrail_hooks/test_bedrock_guardrails.py | Adds two regression tests directly exercising the fixed INPUT/OUTPUT source routing; tests use mocks only (no network calls), asserting both the source param and synthetic ModelResponse shape. |
Sequence Diagram
```mermaid
sequenceDiagram
    participant C as Caller (unified_guardrail)
    participant AG as apply_guardrail
    participant F as _prepare_guardrail_messages_for_role
    participant BR as make_bedrock_api_request
    C->>AG: apply_guardrail(inputs, input_type="response")
    AG->>AG: build mock_messages (role="user")
    AG->>F: _prepare_guardrail_messages_for_role(mock_messages)
    F-->>AG: filtered_messages
    AG->>AG: bedrock_source = "OUTPUT"
    AG->>AG: build synthetic ModelResponse (choices with role="assistant")
    AG->>BR: make_bedrock_api_request(source="OUTPUT", response=synthetic)
    BR-->>AG: bedrock_response
    AG-->>C: GenericGuardrailAPIInputs (masked/original texts)
    Note over C,AG: Before fix: always used source="INPUT" causing output-only policies to return action=NONE
```
Reviews (4) — last reviewed commit: "Merge remote-tracking branch 'upstream/l..."
…PUT source

Add regression tests that mock make_bedrock_api_request and verify that input_type="request" uses source="INPUT" with user messages, and input_type="response" uses source="OUTPUT" with a synthetic ModelResponse.

Made-with: Cursor
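The shape of those regression tests can be sketched without any network calls. This is a hedged, self-contained illustration: `FakeBedrockGuardrail` is a stand-in class written for this sketch, not the real `BedrockGuardrail`, but the pattern of patching `make_bedrock_api_request` with an `AsyncMock` and asserting on the `source` kwarg mirrors what the PR's tests do.

```python
import asyncio
from unittest.mock import AsyncMock

class FakeBedrockGuardrail:
    """Stand-in for BedrockGuardrail, just enough to show the test pattern."""

    async def make_bedrock_api_request(self, source, **kwargs):
        raise NotImplementedError  # replaced by an AsyncMock in tests

    async def apply_guardrail(self, texts, input_type):
        # the fixed routing: response text goes to Bedrock as OUTPUT
        source = "OUTPUT" if input_type == "response" else "INPUT"
        return await self.make_bedrock_api_request(source=source, texts=texts)

async def run_regression_check():
    g = FakeBedrockGuardrail()
    g.make_bedrock_api_request = AsyncMock(return_value={"action": "NONE"})
    await g.apply_guardrail(["model answer"], input_type="response")
    # the core regression assertion: post-call scans must use source="OUTPUT"
    return g.make_bedrock_api_request.call_args.kwargs["source"]

print(asyncio.run(run_regression_check()))  # → OUTPUT
```

Mock-only tests like this keep the suite fast and make the INPUT/OUTPUT routing bug impossible to reintroduce silently.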
@greptile review again with new commit
… litellm_post_call_non_streaming
| GitGuardian id | GitGuardian status | Secret | Commit | Filename |
|---|---|---|---|---|
| 29203053 | Triggered | Generic Password | 62c2c55 | .circleci/config.yml |
🛠 Guidelines to remediate hardcoded secrets
- Understand the implications of revoking this secret by investigating where it is used in your code.
- Replace the secret and store it safely, following best practices for secret storage.
- Revoke and rotate this secret.
- If possible, rewrite git history. Rewriting git history is not a trivial act. You might completely break other contributing developers' workflow and you risk accidentally deleting legitimate data.
To avoid such incidents in the future, consider:
- following best practices for managing and storing secrets, including API keys and other credentials
- installing secret detection in a pre-commit hook to catch secrets before they leave your machine and ease remediation.
… litellm_post_call_non_streaming
Codecov Report

✅ All modified and coverable lines are covered by tests.
Cause
BedrockGuardrail.apply_guardrail always called make_bedrock_api_request with source="INPUT", even when input_type="response" (post-call / model output). Bedrock guardrails often apply different policies for input vs output (e.g. PII/name rules only on output). Sending assistant text as INPUT led to action=NONE and no block.
Non-streaming completions go through unified_guardrail → OpenAIChatCompletionsHandler.process_output_response → apply_guardrail(..., input_type="response"), so they hit this bug. Streaming worked because that path already used source="OUTPUT" on the Bedrock call.
Fix
Map input_type to the Bedrock source: "request" → INPUT (messages), "response" → OUTPUT. For the OUTPUT path, build a synthetic ModelResponse whose choices carry the text to scan, and call make_bedrock_api_request(source="OUTPUT", response=synthetic_response, ...) so Bedrock evaluates output policies and blocks consistently with streaming.
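The synthetic-response construction for the OUTPUT path can be sketched as follows. This is an illustration under stated assumptions: the real fix builds a litellm `ModelResponse` object that `_create_bedrock_output_content_request` consumes, whereas this stand-in uses a plain dict with the same chat-completion shape; the function name is hypothetical.

```python
# Hypothetical sketch of the OUTPUT-path payload. The real code builds a
# litellm ModelResponse; this plain-dict shape is illustrative only.
def build_synthetic_response(texts):
    # each text to scan becomes one assistant-role choice, mirroring how a
    # real model response presents its output text to the guardrail
    return {
        "object": "chat.completion",
        "choices": [
            {"index": i, "message": {"role": "assistant", "content": t}}
            for i, t in enumerate(texts)
        ],
    }

# Bedrock then receives source="OUTPUT" with this response, so output
# policies (e.g. PII/NAME blocking) evaluate the assistant text and can
# block it, consistent with the streaming path.
```

Because the choices carry `role="assistant"`, the payload builder treats the text as model output rather than user input, which is exactly the distinction the bug erased.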